We present numerical results for various information theoretic properties of the square lattice 
Ising model. First, using a bond propagation algorithm, we find the difference $2H_L(w) - H_{2L}(w)$ 
between entropies on cylinders of finite lengths $L$ and $2L$ with open end cap boundaries, in the limit 
$L \to \infty$. This essentially quantifies how the finite length correction for the entropy scales with the 
cylinder circumference $w$. Secondly, using the transfer matrix, we obtain precise estimates for the 
information needed to specify the spin state on a ring encircling an infinitely long cylinder. Combining 
both results we obtain the mutual information between the two halves of a cylinder (the "excess 
entropy" for the cylinder), where we confirm with higher precision but for smaller systems results 
recently obtained by Wilms et al. - and we show that the mutual information between the two 
halves of the ring diverges at the critical point logarithmically with $w$. Finally we use the second 
result together with Monte Carlo simulations to show that also the excess entropy of a straight 
line of $n$ spins in an infinite lattice diverges at criticality logarithmically with $n$. We conjecture 
that such logarithmic divergence happens generically for any one-dimensional subset of sites at any 
2-dimensional second order phase transition. Comparing straight lines on square and triangular 
lattices with square loops and with lines of thickness 2, we discuss questions of universality. 

PACS numbers: 05.50.+q, 75.10.Hk, 89.70.Cf 



I. INTRODUCTION 

Although the two-dimensional Ising model is one of 
the best studied models and can be solved exactly, there 
are still some questions about it which are not yet set- 
tled. These concern in particular problems of information 
theoretic nature which have become important recently 
in the broader context of quantum critical phenomena, 
where the classical Shannon information is related 
to the von Neumann entropy, and the mutual entropy is 
related to the entanglement entropy. 

To mention just one recent result (which actually trig- 
gered the present study), it was shown by Wilms et al. 
[1] that the mutual information (MI) between two halves 
of an infinite cylinder has a maximum that seems to be- 
come sharper with increasing circumference w, but this 
maximum is not at the critical temperature but at a tem- 
perature Tmax > Tc which does not seem to converge to 
$T_c$ when $w \to \infty$. This result, obtained by a sophisticated 
Monte Carlo method, is highly surprising, as we expect 
any singularity to occur only at Tc. One of the purposes 
of the present paper is to check this by a complementary 
method, and to provide a simple and rigorous proof that 
the height of this maximum is < 1 bit per spin, for any 
T. What diverges at criticality is not the value of the 
MI, but its derivative with respect to T. 

On the other hand, studying information theoretic 
quantities in the Ising model has a long history, with 
rather unclear results so far. The first study relevant for 
us was made in 1984 by R. Shaw [5], who studied the 
Shannon information needed to specify the spin configu- 
ration on a line of n spins in an infinite 2-d lattice. Away 
from Tc one expects this to be linear in n, 

$H_n/n \to \mathrm{const}$ for $n \to \infty$.   (1) 



Furthermore, one expects the "excess entropy" [2] or "effective 
measure complexity" [3], defined as the MI between the two 
halves of this line as 

$E = \lim_{n \to \infty} (2H_n - H_{2n})$,   (2) 

to be finite. Due to the long range correlations at Tc, it 
is not clear whether the latter still holds at the critical 
point. Several simulations [H HHS] suggested that the 
excess entropy increases sharply when $T \searrow T_c$, but stays 
finite at $T_c$. A second purpose of the present paper is to 
show that $E$ diverges logarithmically with $n$. Indeed, the 
MI between the two halves of a ring encircling a cylinder 
shows the same logarithmic divergence. The coefficient 
of the logarithmic term is universal with respect to the 
lattice type (square vs. triangular), but depends on the 
geometry of the line (straight line vs. topologically trivial 
loop on a plane lattice). It agrees numerically with the 
result obtained in [7] for the ground states of quantum 
Ising chains. 

Together with these main results, we obtain two more 
technical results: (i) By re-analyzing the numerical results 
of [5] we obtain a more precise estimate of their 
universal constant $r_1$ (called $r_c$ in the present paper). 
And (ii) by obtaining transfer matrix results for widths 
up to $w = 29$ we check the universality of $r_1$ with higher 
precision. 

The rest of the paper is organized as follows: In Sec. 2 
we recall some basic facts about mutual informations and 
Markov chains. In Sec. 3 we use a bond propagation 
algorithm (BPA) [9-11] to calculate the entropy of a long 
cylinder, from which we then isolate the contribution due 
to the open boundary conditions at its two ends. In Sec. 4 
we present results from a transfer matrix calculation and 
combine them with the results from Sec. 3 to obtain the 
scaling properties of the MI studied in [1]. We also obtain 
there precise estimates for $r_c$ and for the Shannon entropy 
$h_c$ per site in an infinitely long line of spins. Mutual 
informations between two halves of a ring are studied in 
Sec. 5. Extensive Monte Carlo simulations (using Wolff's 
algorithm [12]) are finally used in Sec. 6, together with the 
value of $h_c$ obtained in Sec. 4, to show that the excess 
entropy for such a line of $n$ spins in an infinite system 
diverges logarithmically at $T_c$. Questions of universality 
for this divergence are discussed by studying also other 
subsets of spins that are one-dimensional in the limit $n \to \infty$. 
The paper finishes with conclusions in Sec. 7. Several 
technical aspects are discussed in three appendices. 



II. MUTUAL INFORMATION 

In this paper we shall only deal with classical Shannon 
information theory [13]. Given an alphabet $S = \{0, 1, \ldots, k-1\}$ 
of $k$ letters and probabilities $p_i$ for the $i$-th letter to occur, 
the entropy is defined as $H = -\sum_{i=0}^{k-1} p_i \log p_i$, 
with the logarithm to base 2 if the entropy is to be measured 
in bits. The following alphabets will be used in this paper: 

• The binary alphabet $A = \{0,1\}$ or $A = \{-,+\}$, whose 
letters are called spins $s$. The concatenation of $n$ such spins, 
$S_n = (s_1 s_2 \ldots s_n)$, forms a string, and its entropy 
$H(S_n)$ will be denoted by $H_n$ and will be called a block 
entropy. 

• Each string of spins can itself be considered as a 
"letter" in an alphabet $A^n = \{-,+\}^n$ of size $2^n$. 
For example, the alphabet of a spin pair is 
$A^2 = \{--, -+, +-, ++\}$. We can then, just as 
we did above, concatenate $L$ such new letters to 
form a string, which then actually is a rectangular 
array of size $n \times L$. If we have this in mind, we 
denote letters $\in A^n$, i.e. $n$-tuples of spins, as 
$s^{[n]} = (s_1, s_2, \ldots, s_n)$. The corresponding rectangular 
array is then denoted as $S_L^{[n]} = (s_1^{[n]} s_2^{[n]} \ldots s_L^{[n]})$. 

• In particular we shall consider strings of length $n = w$, 
where $w$ is the width of the lattice. We will 
assume that the lattice is periodic laterally, i.e. $w$ 
is actually the circumference of a cylinder, and $s^{[w]}$ 
can be viewed as forming a ring. The entropy of $L$ 
adjacent such rings, i.e. of a rectangular $w \times L$ array 
with periodic b.c. in the $w$-direction and open b.c. 
in the $L$-direction, will be called $H_L(w)$. 

For any joint probability distribution over a product 
of alphabets $A$ and $B$ the MI is defined as [13] 

$I(A : B) = H(A) + H(B) - H(AB) = H(A) - H(A|B) = \sum_{i,j} p_{ij} \log \frac{p_{ij}}{p_i p_j}$.   (3) 



It is equal to the average decrease in code length needed 
to specify the value i in a random realization, if the value 



of j gets known and if encoding is done optimally for 
many such independent realizations jointly. 
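
As a small numerical illustration of Eq. (3), the following sketch evaluates the MI of a joint distribution both from the three entropies and from the double sum. The joint table (two binary spins that agree with probability 0.9) and all function names are assumptions made for this example only.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def mi_from_entropies(joint):
    """I(A:B) = H(A) + H(B) - H(AB) for a joint table joint[i][j]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    p_ab = [p for row in joint for p in row]
    return entropy_bits(p_a) + entropy_bits(p_b) - entropy_bits(p_ab)

def mi_from_double_sum(joint):
    """Same quantity via sum_ij p_ij log2(p_ij / (p_i p_j))."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (p_a[i] * p_b[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0.0)

# Hypothetical example: two binary spins that agree with probability 0.9.
joint = [[0.45, 0.05],
         [0.05, 0.45]]
mi_a = mi_from_entropies(joint)
mi_b = mi_from_double_sum(joint)
```

Both routes give the same number, and for independent variables the result is zero.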

If $A$ in Eq. (3) is a set of length-$n$ strings and $B$ the set 
of single letters, then 

$H(AB|A) = H(S_{n+1}) - H(S_n) = H_{n+1} - H_n \equiv h_n$   (4) 

(with $H_0 = 0$) is the information needed to specify the 
last one in a string of $n+1$ letters, given all previous 
ones. If the source generating the string is ergodic and 
the limit exists, then 

$h = \lim_{n \to \infty} h_n$   (5) 

is called the entropy per letter of the source, or simply the 
entropy per letter. If, moreover, the probability distribution 
is stationary then $h_n$ is monotonically decreasing 
with $n$, since the difference 

$\delta h_n = h_{n-1} - h_n$   (6) 

can be interpreted as the amount by which the uncertainty 
about the last of $n+1$ letters decreases, if the first 
one gets known (and all intermediate ones are known 
already) [3]. An alternative interpretation of $\delta h_n$ is as 
a conditional mutual information [13], 
$\delta h_n = I(s_{n+1} : s_1 | s_2 \ldots s_n)$. 

Of particular importance are Markov chains. A 
Markov chain of order $k$ is characterized by 

$p_{s_n | s_1 \ldots s_{n-1}} = p_{s_n | s_{n-k} \ldots s_{n-1}}$,   (7) 

i.e. the memory is of length $\le k$. Notice that this does 
not imply that there are no longer ranging correlations, 
but they are all mediated by a chain of short range steps. 
For a Markov chain of order $k$ it is easily seen that [3] 

$\delta h_n = 0$ for $n > k$.   (8) 

Thus for a first order Markov chain $\delta h_1 = 2H_1 - H_2 \ge 0$, 
while $\delta h_n = 0$ for $n \ge 2$. 

Notice that this notation assumes that the chain is in 
its stationary state, i.e. Hi does not refer to the entropy 
of the first letter(s), if there is a transient. But the basic 
result is more general. Consider e.g. a heterogeneous 
and non-stationary first-order Markov chain $A - B - C - D$. 
Then 

$I(AB : CD) = I(B : C)$.   (9) 



This allows an immediate generalization to Markov fields. 
A (first-order) Markov field is a graph with random variables 
at the vertices, such that any two subsets of nodes 
$A, C$ become independent, if one conditions on a separating 
set $B$ (the set $B$ separates $A$ and $C$, if every path 
connecting the latter has to pass through $B$). Consider 
now a splitting of the entire graph into two disjoint subsets 
$A, C$. Furthermore, divide each subset into its interior 
and its boundary, where the latter is the set of 
nodes with links to the other subset. This defines then 
a Markov chain $A_0 - \partial A - \partial C - C_0$, where the subscript "0" 
indicates the interior and "$\partial$" indicates the boundary. 
Eq. (9) gives then that the MI between any two subsets 
is equal to the MI between their boundaries, 

$I(A : C) = I(\partial A : \partial C)$,   (10) 

and thus smaller than the entropy of either boundary. 
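
The identity of Eq. (9), on which Eq. (10) rests, can be checked numerically. The sketch below is only an illustration: it builds a heterogeneous, non-stationary first-order chain of four variables with randomly drawn (hypothetical) transition tables and compares the MI between the two halves with the MI between the two inner variables.

```python
import itertools
import math
import random

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def marginal(joint, keep):
    """Marginalize a dict {(a, b, c, d): p} onto the positions listed in keep."""
    out = {}
    for state, p in joint.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_information(joint, left, right):
    h = lambda idx: entropy_bits(marginal(joint, idx).values())
    return h(left) + h(right) - h(left + right)

def random_distribution(rng, k):
    weights = [rng.random() for _ in range(k)]
    total = sum(weights)
    return [x / total for x in weights]

rng = random.Random(0)          # fixed seed, arbitrary choice
k = 3                           # alphabet size, arbitrary choice
p_a = random_distribution(rng, k)
t_ab = [random_distribution(rng, k) for _ in range(k)]   # p(b|a)
t_bc = [random_distribution(rng, k) for _ in range(k)]   # p(c|b), different table
t_cd = [random_distribution(rng, k) for _ in range(k)]   # p(d|c), different table

joint = {}
for a, b, c, d in itertools.product(range(k), repeat=4):
    joint[(a, b, c, d)] = p_a[a] * t_ab[a][b] * t_bc[b][c] * t_cd[c][d]

mi_halves = mutual_information(joint, (0, 1), (2, 3))   # I(AB : CD)
mi_boundary = mutual_information(joint, (1,), (2,))     # I(B : C)
```

Up to rounding, `mi_halves` and `mi_boundary` coincide, whatever the transition tables are.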

For a bi-infinite string $S = (\ldots s_{-1} s_0 s_1 \ldots)$ the excess 
entropy [2] or effective measure complexity [3] is defined 
as the MI between the left half $(\ldots s_{-1} s_0)$ and the 
right half $(s_1 s_2 \ldots)$ or, equivalently, as 

$E(S) = \sum_{n=0}^{\infty} (h_n - h) = \sum_{n=1}^{\infty} n\, \delta h_n$.   (11) 

For a first-order Markov chain we have simply 

$E(S) = \delta h_1$,   (12) 

i.e. the MI between left and right halves is just equal to 
the MI between two neighboring letters. 
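
Eqs. (8), (11) and (12) can be verified by brute force for a small example. The sketch below assumes a stationary two-state first-order Markov chain with an arbitrarily chosen transition matrix `P` (not taken from the paper), enumerates all strings to obtain the block entropies $H_n$, and then forms $h_n$, $\delta h_n$ and $2H_n - H_{2n}$.

```python
import itertools
import math

# Hypothetical transition matrix P[x][y] = p(next = y | current = x).
P = [[0.9, 0.1],
     [0.3, 0.7]]
# Stationary distribution of a two-state chain.
norm = P[0][1] + P[1][0]
pi = [P[1][0] / norm, P[0][1] / norm]

def block_entropy(n):
    """H_n in bits, by explicit enumeration of all 2^n strings (H_0 = 0)."""
    if n == 0:
        return 0.0
    total = 0.0
    for s in itertools.product((0, 1), repeat=n):
        p = pi[s[0]]
        for x, y in zip(s, s[1:]):
            p *= P[x][y]
        total -= p * math.log2(p)
    return total

H = [block_entropy(n) for n in range(13)]
h = [H[n + 1] - H[n] for n in range(12)]          # h_n = H_{n+1} - H_n
dh = {n: h[n - 1] - h[n] for n in range(1, 12)}   # delta h_n = h_{n-1} - h_n
excess = [2 * H[n] - H[2 * n] for n in range(1, 7)]
```

For this chain `dh[1]` equals $2H_1 - H_2$, all higher `dh[n]` vanish, and every entry of `excess` already equals `dh[1]`, so no limit $n \to \infty$ is needed in the first-order case.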

A first application of this to the 2-d Ising model on an 
infinitely long strip of width $w$ (with either periodic or 
open lateral boundary condition) is that the excess entropy 
per spin is equal to the MI between two adjacent 
lines (resp. rings) of $w$ spins, and is thus bounded by $w$ 
bits, because the transfer matrix generates a first order 
Markov process. This explains immediately why the MI 
per width measured in [1] is finite for all $T$, even in the 
limit $w \to \infty$ and $T \to T_c$. In order to compute this 
excess entropy explicitly, we need two ingredients: Because 
$\delta h_1 = h_0 - h_1$, we need both the unconditional 
information $h_0 = H_1(w)$ for a ring and the information 
$h_1 = H_2(w) - H_1(w)$ for a ring conditioned on one of its 
neighbors. The first one will be computed by the transfer 
matrix (TM), and the second by the bond propagation 
algorithm (BPA). We will show that the combination of 
both algorithms allows us to obtain mutual informations for 
up to $w \approx 30$, about twice the size feasible on a workstation 
with the TM alone. 



III. CYLINDER ENTROPIES OBTAINED BY 
BOND PROPAGATION 

The bond propagation algorithm (BPA) for the Ising 
model [9, 10] is a modification of a similar algorithm 
developed by Frank and Lobb [14] for finding the resistance 
of a 2-d resistor network. In contrast to transfer matrix 
methods it cannot be used to obtain probabilities of spin 
configurations (such as those needed to calculate the entropy 
of a single ring), but it is the most efficient and 
accurate method known so far for calculating free and 
internal energies of finite 2-d systems [11]. It was used 
until now only for open lateral b.c., but as pointed out in 
[11] it can also be adapted to periodic b.c. in one direction 
(but not in both). The basic strategy is described 
in [13] , and details are given in ^ [TUl [H] . 



We implemented the BPA for the Ising model with 
cylindrical b.c. with sizes $L \times w$. Although we are interested 
in the limit $L/w \to \infty$, we found that $L = 10w$ is in 
general enough to obtain results precise up to machine 
precision. For $w < 20$ we checked our results also by 
means of a microcanonical transfer matrix [16, 17] that 
is very accurate in our implementation. 

In the limit $L, w \to \infty$ one obtains of course Onsager's 
result [18] which reads, at inverse temperature 
$\beta = \beta_c = \ln(1 + \sqrt{2})/2 = 0.44068679\ldots$, 

$h_c \equiv \lim_{L,w \to \infty} \frac{H_L(w, \beta_c)}{Lw} = 0.442142977\ldots$ bits.   (13) 

The values 

$h(w, \beta) = \lim_{L \to \infty} \frac{H_L(w, \beta)}{Lw}$   (14) 

indeed converge at $\beta = \beta_c$ to $h_c$ as 

$h(w, \beta_c) \;\ldots\; 0.2618 \;\ldots\; 0.15 \;\ldots\; 0.5$   (15) 

This conforms with the result of [19, 20] that the free 
and internal energies per site are power laws in $1/w^2$. 
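
The value quoted in Eq. (13) can be reproduced from standard closed-form results for the infinite square lattice at criticality. The sketch below assumes coupling $J = 1$ and uses two textbook facts that are not spelled out in the text above: $\ln Z / N = \tfrac{1}{2}\ln 2 + 2G/\pi$ at $\beta_c$ (with $G$ Catalan's constant, hard-coded here) and an energy per site of $-\sqrt{2}$.

```python
import math

CATALAN = 0.915965594177219          # Catalan's constant G (hard-coded)

beta_c = 0.5 * math.log(1.0 + math.sqrt(2.0))

# ln Z per site at criticality: (1/2) ln 2 + 2G/pi.
ln_z_per_site = 0.5 * math.log(2.0) + 2.0 * CATALAN / math.pi
# Internal energy per site at criticality, in units of J.
u_per_site = -math.sqrt(2.0)

# Entropy per site: s = ln Z / N + beta * u (in nats), then converted to bits.
s_nats = ln_z_per_site + beta_c * u_per_site
h_c_bits = s_nats / math.log(2.0)
```

With these inputs `beta_c` and `h_c_bits` agree with the digits given for $\beta_c$ and in Eq. (13).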

More interesting for us, however, is the detailed 
convergence with $L$. We assume that the limit 
$\lim_{L \to \infty} (H_L(w, \beta)/w - h(w, \beta) L)$ exists. In this case we 
can calculate it by comparing two cylinders of length $L$ 
to one cylinder of length $2L$, and obtain 

$\Delta(w, \beta) = \lim_{L \to \infty} [2H_L(w, \beta) - H_{2L}(w, \beta)]$.   (16) 

We found that this limit indeed converged very rapidly. 
Away from the critical region $\Delta(w, \beta)/w$ also stays 
bounded for $w \to \infty$, but not near $\beta = \beta_c$, as seen from 
Fig. 1. 

A more detailed study shows that both the peak height 
in Fig. 1 and the distance of the peak position from $\beta_c$ 
scale like powers of $w$ (see Figs. 2, 3), 

$\max_\beta [\Delta(w, \beta)] \sim w^{1.18 \pm 0.03}$   (17) 

$\beta_{w,\max} - \beta_c \equiv \arg\max_\beta [\Delta(w, \beta)] - \beta_c \sim w^{-0.98 \pm 0.03}$.   (18) 

The errors are here rather large due to large corrections 
to scaling. 

From an information theoretic point of view, 
$\Delta(w, \beta)/w$ has two contributions. On the one hand, the 
system consisting of two independent cylinders of length 
$L$ misses all horizontal bonds in the center of the cylinder 
of length $2L$ which connect the left and right halves. 
This contributes a term which basically measures the effect 
of the free end boundary condition as compared to, 
say, periodic boundary conditions. But this is not the 
only effect. Even if these bonds were present, the information 
needed to specify the two cases would differ by 
the MI between the left and right halves, i.e. by the excess 
entropy. As we have already pointed out in Sec. 2, 
this excess entropy (per unit of $w$) is always bounded, 
so that the first contribution alone is responsible for the 
divergence. 

FIG. 1. (Color online) The quantity $\Delta(w, \beta)$ defined in 
Eq. (16), for critical Ising systems of finite widths $w$, plotted 
against $\beta$. The vertical dotted line is the critical $\beta_c$. 

FIG. 2. (Color online) Log-log plot of the peak heights of 
Fig. 1, i.e. of the maximal values of $\Delta(w, \beta)$, versus $w$. The 
straight line has slope 1.18. 

FIG. 3. (Color online) Log-log plot showing the scaling of the 
peak positions of Fig. 1 versus $w$. The straight line has slope 
$-0.98$, indicating that $\beta_{w,\max} - \beta_c \sim w^{-0.98}$. 



IV. SHANNON INFORMATION OF A SINGLE 
RING AND EXCESS ENTROPY OF A 
CYLINDER 

Since any transfer matrix of the Ising model on a finite 
width strip induces a Markov chain, the entropy $w\,h(w, \beta)$ 
per ring studied in the previous section is just the conditional 
Shannon entropy of this ring, conditioned on the 
previous one, i.e. in the notation of Sec. 2 

$h(w, \beta) = (H_2(w, \beta) - H_1(w, \beta))/w$   (19) 

(notice again that we assume stationarity, i.e. $H_1(w, \beta)$ 
refers to a single ring far from the ends of the cylinder). 
To skip the transient state, this quantity can be calculated 
by $w\,h(w, \beta) = \lim_{L \to \infty} [H_{L+1}(w, \beta) - H_L(w, \beta)]$ 
using the results obtained with the BPA in Sec. 3. 
In order to obtain the excess entropy 
$E(w) := \lim_{L \to \infty} E(S_L^{[w]}) = 2H_1(w) - H_2(w)$, or mutual 
information in this case, of the infinitely long cylinder, we need 
in addition values of $H_1(w)$, i.e. of the unconditioned 
Shannon entropy of this ring. This is obtained from a 
conventional transfer matrix (TM) calculation. Details 
of this calculation are given in Appendix A. At $\beta = \beta_c$ 
we obtained data for $w$ up to 29, at a few selected points 
away from criticality up to $w = 28$. 

In principle one can obtain in this way also the Shannon 
entropy $H_2$ for two adjacent rings. This would, however, 
require estimating all probabilities over $2^{2w}$ spin 
configurations. With present workstations this can be 
done only for $w < 16$. On the other hand, the BPA or a 
similar scheme can find the entropy of the whole lattice 
up to much larger $w$ - but it cannot give the entropy 
of a single ring. By using the Markov chain property 
and combining the BPA with the TM, we can therefore 
compute the numerically exact excess entropy for $w$ up to 
$\approx 30$. 
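
For very small $w$ both $H_1(w)$ and $H_2(w)$ can be obtained directly from a dense transfer matrix, which makes the Markov-chain relations above concrete. The sketch below is not the implementation of Appendix A; it assumes a symmetric transfer matrix between neighboring rings of a cylinder with coupling $J = 1$, finds its leading eigenvector $\psi$ by power iteration, and uses the standard relations $P(s) = \psi(s)^2$ and $P(s, s') = \psi(s) T(s, s') \psi(s') / \lambda$ for an infinitely long cylinder. All names are illustrative.

```python
import itertools
import math

def ring_statistics(w, beta, iterations=500):
    """Return (H1, H2) in bits for one ring and two adjacent rings (w >= 3)."""
    states = list(itertools.product((-1, 1), repeat=w))
    n = len(states)

    def ring_energy(s):
        # Sum over bonds inside a periodic ring.
        return sum(s[i] * s[(i + 1) % w] for i in range(w))

    # Symmetric transfer matrix: half of each ring's bonds plus inter-ring bonds.
    half = [math.exp(0.5 * beta * ring_energy(s)) for s in states]
    T = [[half[a] * half[b] * math.exp(beta * sum(x * y for x, y in zip(sa, sb)))
          for b, sb in enumerate(states)]
         for a, sa in enumerate(states)]

    # Power iteration for the leading eigenpair (T is positive and symmetric).
    psi = [1.0 / math.sqrt(n)] * n
    lam = 1.0
    for _ in range(iterations):
        new = [sum(T[a][b] * psi[b] for b in range(n)) for a in range(n)]
        lam = math.sqrt(sum(x * x for x in new))
        psi = [x / lam for x in new]

    h1 = -sum(p * math.log2(p) for p in (x * x for x in psi) if p > 0.0)
    h2 = 0.0
    for a in range(n):
        for b in range(n):
            p = psi[a] * T[a][b] * psi[b] / lam
            if p > 0.0:
                h2 -= p * math.log2(p)
    return h1, h2

beta_c = 0.5 * math.log(1.0 + math.sqrt(2.0))
H1, H2 = ring_statistics(4, beta_c)
excess = 2.0 * H1 - H2          # MI between adjacent rings, at most w bits
h_ring = (H2 - H1) / 4          # conditional entropy per spin, as in Eq. (19)
```

The cost grows like $4^w$, which is why this direct route stops at modest widths and the combination with the BPA is needed for larger $w$.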

FIG. 4. (Color online) Shannon entropies (in nats) for a ring 
encircling a cylinder of width $w$, after subtracting the leading 
term $\propto w$. 

We first checked that our TM data were indeed consistent 
with the conjecture of Stephan et al. [8, 21] 

$H_1(w, \beta) = h(\beta) w + r(\beta) + o(1)$   (20) 

for $w \to \infty$, where 

$r(\beta) = 0$ for $\beta < \beta_c$, 
$r(\beta) = r_c := 0.2543925(5)$ for $\beta = \beta_c$,   (21) 
$r(\beta) = \ln 2$ for $\beta > \beta_c$, 

while $h(\beta)$ is the Shannon entropy per spin for an infinitely 
wide cylinder. For this we plot in Fig. 4 the differences 
$H_1(w, \beta) - h(\beta) w$ for three values of $\beta$ against 
$w$, where $h(\beta)$ is chosen such that the curves become flat 
for $w \to \infty$. We find perfect agreement. Moreover we 
see that the leading corrections for large w are ~ l/w for 
/3 = /3c, - l/w^ for /3 < /3c, and - l/w^ for /3 > 13^. 

Although Fig. 4 is sufficient to verify Eq. (21), it does 
not do justice to the very high precision of the transfer 
matrix data of 0. In particular, we want for Sec. 5 a 
much more precise estimate of the entropy per spin for 
the critical case. Therefore we first re-analyzed the data 
of [2 to obtain a more precise estimate 

$r_c = 0.254392505(10)$ nats $= 0.367010805(14)$ bits. 

(22) 

Details are given in Appendix B. After that, universality 
is invoked to use this value as a constraint in a similar 
analysis, in order to obtain 

$h(\beta_c) = 0.37692626(7)$ nats $= 0.54378965(10)$ bits. 

(23) 

Again details are given in Appendix B. 
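
Since the text switches between nats and bits, a quick consistency check of the two pairs of values in Eqs. (22) and (23) may be useful. The helper name below is an assumption for this illustration; the only fact used is that 1 nat equals $1/\ln 2$ bits.

```python
import math

def nats_to_bits(x):
    """Convert an entropy from natural units to bits."""
    return x / math.log(2.0)

r_c_bits = nats_to_bits(0.254392505)   # central value of Eq. (22)
h_bits = nats_to_bits(0.37692626)      # central value of Eq. (23)
```

Both conversions agree with the bit values stated in the text to within the quoted uncertainties.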

After having verified the correctness of the algorithm, 
we now proceed to obtain the excess entropy 
$E(w, \beta) = H_1(w, \beta) - w\,h(w, \beta)$. The data are given in Fig. 5. The 
overall picture is shown in the inset, while the main figure 
shows an enlargement close to the critical region. We 
indeed verify the qualitative behavior found in [1]. In 
particular, we find that the curves for $w > 8$ have peaks 
in the region $\beta < \beta_c$, and that the peak positions shift 
to smaller values of $\beta$ as $w$ is increased. As seen from 
Fig. 6, the peak heights first decrease with $w$ and reach 
a minimum at $w = 15$. They then increase again, but 
the rate of increase slows down for $w > 21$ (see inset of 
Fig. 6). The peak positions (Fig. 7) show a similar behaviour, 
and reach a minimum at $w = 21$. Extrapolating 
these data to $w = \infty$ is not easy, but our best estimates 
are 0.417(2) for the position and 0.134(2) for the height. 
Both are consistent with the less precise results of Wilms 
et al. [1] obtained for larger systems. While the peaks 
initially get sharper with increasing $w$, their widths soon 
seem to become independent of $w$. All this indicates that 
these peaks have no close relationship with criticality. 

FIG. 5. (Color online) Excess entropy $E(w, \beta)$ (in bits per 
unit width) between two halves of a cylinder. From top to 
bottom (at the critical temperature), the curves correspond to 
$w$ = 12, 14, 16, 18, 20. The inset shows the global behavior for 
$w$ = 6, 8, 10, 12, 14. For $w < 6$ the curves are monotonically 
increasing. 

On the other hand, the slopes (obtained by numerical 
differentiation) of the mutual information in Fig. 5 seem 
to diverge to $-\infty$ near $\beta_c$, as seen from Fig. 8. A more 
detailed analysis (Fig. 9) shows that the minimum of the 
slope (i.e. the inflection point) moves towards $\beta_c$ according 
to a power law $\beta_{w,\min} - \beta_c \sim w^{-1.39 \pm 0.08}$, where 
$\beta_{w,\min} = \arg\min_\beta [dE(w, \beta)/d\beta]$. Moreover, a linear fit of 
the minimum value is also shown in the figure, suggesting 
a power law 

$-\min_\beta [dE(w, \beta)/d\beta] \sim w^{a}$   (24) 

where $a$ is equal to or slightly less than 1. The uncertainties 
in both exponents are large due to large corrections to 
scaling. All this suggests strongly that it is the slope 
of the mutual information, not the mutual information 
itself, that diverges at the critical point. 
FIG. 6. (Color online) Peak heights (in bits) of the excess 
entropy in Fig. 5. A minimum and an inflection point are 
located at $w = 15$ and $w = 21$ respectively. The inset shows 
the derivative of the height with respect to $w$. 

FIG. 7. (Color online) Peak positions of the excess entropy 
in Fig. 5. Again the inset shows the slope of the curve. 

FIG. 8. (Color online) Slope of the excess entropy 
$dE(w, \beta)/d\beta$ per width in Fig. 5. 

FIG. 9. (Color online) Log-log plot of the positions of the 
minimum in Fig. 8. The straight line has slope $-1.39$. The 
inset shows the value of the minimum in Fig. 8 on a linear 
plot. The straight line indicates a power law. 



V. MUTUAL INFORMATION BETWEEN 
PARTS OF A SINGLE RING 



The fact that the MI between the two halves of a cylinder 
does not diverge, demonstrated in the previous section, 
does not tell anything about MIs between parts of 
the ring separating the cylinder halves. Let us divide a 
ring of length $w$ into two parts of lengths $m$ and $w - m$. 
The transfer matrix calculations of the previous section 
allow us also to compute the Shannon entropies of these 
parts, and hence the MI between them. 

Data for $w = 28$ at $\beta = \beta_c$ are shown in Fig. 10. 
Together with the MI we show there a fit 

$\mathrm{MI} \approx a + b' \ln\left(\frac{w}{\pi} \sin\frac{\pi m}{w}\right)$   (25) 

as suggested in [7] for periodic chains of $w$ spins in the 
ground state of the quantum Ising model in a transverse 
magnetic field (see also [22]). We see that the fit is not 
perfect (the points for $m = 1$ and $m = w - 1$ clearly deviate 
from it), but the overall agreement is surprisingly 
good. The constant $a$ is obtained as $a = 0.380(1)$, which 
is clearly different from the value 0.329 found in [7], suggesting 
that $a$ is not universal. In order to compare $b'$, 
we first plot the maximal MI (for $m = w/2$) against $w$, 
see Fig. 11. We see clearly a logarithmic increase. Fitting 
the values for even and odd $w$ with correction terms 
$\propto 1/w$ gives the estimate 

$b' = 0.1201(2)$ nats,   (26) 

corresponding to 0.1733(3) bits. Within less than 1% this 
agrees with the value found in [7]. 

Thus we conjecture that $b'$ is universal. It should be 
related to the universal constant $r_c$ discussed in the previous 
section, but the detailed relationship is not clear. 
Further questions of universality are discussed in the next 
section. 

FIG. 10. (Color online) Mutual informations between two 
parts of a ring of $w = 28$ spins encircling a cylinder (in natural 
units), plotted against the length $m$ of one of the parts. The 
continuous line corresponds to Eq. (25). 

FIG. 11. (Color online) Mutual informations between two 
equal halves of a ring of $w = 2m$ spins (in natural units), 
plotted against $\log w$. The detailed form of the fit is presumably 
not to be relied upon, but the coefficient of the term 
linear in $\log w$ seems to be robust. 
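
The fit form of Eq. (25) is easy to evaluate. The sketch below uses the fitted values quoted in this section, $a = 0.380$ and $b' = 0.1201$ (natural units), as default arguments; the function name is an assumption for this illustration. It shows the symmetry under $m \leftrightarrow w - m$ and that the maximum sits at $m = w/2$, where the value reduces to $a + b' \ln(w/\pi)$ and thus grows logarithmically with $w$.

```python
import math

def ring_mi_fit(m, w, a=0.380, b_prime=0.1201):
    """Eq. (25): a + b' ln((w/pi) sin(pi m / w)), in nats, for 0 < m < w."""
    return a + b_prime * math.log((w / math.pi) * math.sin(math.pi * m / w))

w = 28
curve = [ring_mi_fit(m, w) for m in range(1, w)]   # m = 1, ..., w - 1
peak_m = 1 + curve.index(max(curve))               # position of the maximum
peak_value = ring_mi_fit(w // 2, w)
```

Doubling $w$ raises the value at $m = w/2$ by $b' \ln 2$, independently of $a$.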
FIG. 12. (Color online) Convergence of $h_n$, the entropy per 
spin in a line of length $n$ embedded in a lattice of size $L \times L$ 
with helical (i.e. practically periodic) b.c. On the x-axis is 
plotted $1/n$, so that a straight line indicates a convergence 
$h_n \approx h + b/n$. The straight line has intercept $h = 0.54378\ldots$ 
as obtained for rings encircling cylinders. The inset shows the 
data after subtraction of this linear part. 



VI. ENTROPIES OF LOOPS AND 
OPEN-ENDED STRINGS FROM MONTE CARLO 
SIMULATIONS 

Finally we wanted to see whether the entropy of a set 
of $n$ spins embedded in an infinite 2-d lattice shows any 
anomalous behavior at the critical point, when $n \to \infty$. 

Let us look first at straight lines of spins. We expect 
of course that $h = \lim_{n \to \infty} h_n$ is the same as the entropy 
per site in an infinitely long ring encircling a cylinder. 
But it is a priori not clear whether the excess entropy - 
i.e. the MI between two halves of the line - also diverges 
when $n \to \infty$ and $T \to T_c$. 

Previous studies have all found that the excess 
entropy $E = E(S_n)$ has a maximum near $T = T_c$ [23], 
but none found any divergence. All these studies used 
Monte Carlo (MC) simulations. In view of the difficulties 
of measuring $E$ precisely in such simulations, this should not 
be taken as a real proof that $E$ remains finite. 
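
For orientation, the Wolff cluster update [12] used for the Monte Carlo simulations of this paper can be sketched in a few lines. This is a generic textbook version, not the authors' code: it assumes coupling $J = 1$, an $L \times L$ lattice with ordinary periodic boundaries (the paper's figures mention helical ones), and a small lattice chosen only so that the example runs quickly.

```python
import math
import random

def wolff_update(spins, L, beta, rng):
    """Grow and flip one Wolff cluster; returns the cluster size."""
    p_add = 1.0 - math.exp(-2.0 * beta)      # bond activation probability
    x0, y0 = rng.randrange(L), rng.randrange(L)
    seed_spin = spins[x0][y0]
    spins[x0][y0] = -seed_spin               # flipping marks the site as visited
    stack = [(x0, y0)]
    size = 1
    while stack:
        x, y = stack.pop()
        neighbors = (((x + 1) % L, y), ((x - 1) % L, y),
                     (x, (y + 1) % L), (x, (y - 1) % L))
        for nx, ny in neighbors:
            # Only still-unflipped spins parallel to the seed can join.
            if spins[nx][ny] == seed_spin and rng.random() < p_add:
                spins[nx][ny] = -seed_spin
                stack.append((nx, ny))
                size += 1
    return size

rng = random.Random(1)                        # fixed seed, arbitrary choice
L = 16
beta = 1.0                                    # deep in the ordered phase
spins = [[1] * L for _ in range(L)]
for _ in range(200):
    wolff_update(spins, L, beta, rng)
abs_magnetization = abs(sum(map(sum, spins))) / (L * L)
```

Block entropies of lines of spins would then be estimated from configurations sampled in this way; how that is done reliably is the subject of Appendix C and is not reproduced here.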

These difficulties and our strategies used to overcome 
them are detailed in Appendix C. The final result is 
shown in Fig. 12 where we plotted $h_n$ versus $1/n$, for 
three lattice sizes ($L$ = 4096, 16384, and 65536). The 
straight line is such that it crosses the y-axis at $h = 0.543789\ldots$ 
as determined in the previous section, and 
becomes tangent to the $L = 65536$ data extrapolated to 
$1/n \to 0$. It has a slope of $b = 0.085(1)$. This means that 

$H_n \approx n h + b \ln n$ bits, $\quad b = 0.085 \pm 0.001$.   (27) 

When compared to the results of the previous section, 
we see that $b$ is very close to $b'/2$ (within one standard 
deviation). The relationship $b' = 2b$ would be quite 
plausible, given the fact that $\delta h_m$ is non-zero both for 
$m = O(1)$ and for $w - m = O(1)$. It can be proven exactly 
for entanglement entropies in quantum spin chains 
at $T = 0$ [22], but it is not clear whether the proof holds 
also in the present case. 

On the other hand, it seems natural to conjecture that 
$b$ is universal. To test this, we performed analogous simulations 
also for the triangular lattice. The Monte Carlo 
simulations used the same system sizes and had the same 
statistics. But, since we only wanted to check whether 
the data are also fitted by Eq. (27) with the same value 
of $b$, we did not perform any transfer matrix calculations. 
Thus the fit shown in Fig. 13 is not constrained to pass 
through a precisely known value of $h$ for $1/n \to 0$, in contrast 
to that in Fig. 12. Yet, the result clearly suggests 
universality as regards the type of lattice. 

FIG. 13. (Color online) Convergence of $h_n$ for a line of spins 
embedded in a triangular lattice. In contrast to Fig. 12, now 
the slope of the straight line fit is fixed (to the value 0.085 
obtained for the square lattice), but its intercept is fitted. 
From the latter we obtain $h = 0.5839(3)$ for the triangular 
lattice. 

An indication that universality holds even more generally 
comes from looking at lines of width 2. In Fig. 14 
we show the entropies $h_n$ (in bits) when the "alphabet" 
is a pair of spins $s^{[2]}$ (so the string is of the form $S_n^{[2]}$), 
and $H_n = H(S_n^{[2]})$ is the entropy needed to specify the 
spin configuration on a $2 \times n$ rectangular block. In the 
limit $n \to \infty$ this block becomes 1-dimensional, and 

$\lim_{n \to \infty} h_n = \lim_{w \to \infty} w^{-1} H_2(w, \beta_c) = 0.442143 + 0.543790 = 0.985933$ bits.   (28) 

Thus we can make a constrained fit as in Fig. 12, the result 
of which is shown in Fig. 14. We see that there are now 
much larger corrections to scaling (as expected), but a 
decent fit (with no new parameters!) is obtained with 
the same $b$ as found in the above two cases. 

FIG. 14. (Color online) Convergence of $h_n$ for an $n \times 2$ rectangular 
block of spins embedded in a square lattice. The 
straight line has the same slope as in Fig. 12, and its intercept 
is also given by the results of the previous sections. Thus 
it involves no new fitting parameter. 

Consider finally a set of $n$ spins, not forming necessarily 
a straight line. An extended conjecture of universality 
would be that the value of $b$ depends only on the gross 
geometric shape of this set. As a first test that different 
shapes give rise to Eq. (27) but with different values of $b$, 
we considered loops of $n = 4k$ spins on the square lattice 
formed by four straight legs of length $k$ each. Although 
corrections to scaling are again large, the data shown in 
Fig. 15 clearly suggest that Eq. (27) holds, with $b < 0$ in 
this case. 

FIG. 15. (Color online) Analogous to Fig. 12 but for square 
loops instead of straight lines. Here, $n = 4k$ is the length of 
the loop ($k = 1, 2, \ldots 6$). As in Fig. 12 the intercept of the 
straight line is fixed, while its slope is fitted to the large-$n$ 
behavior for the largest value of $L$. 
VII. CONCLUSION 

In the present paper we have studied quantities related 
to Shannon entropies needed to specify the states of vari- 
ous sets of spins. A natural extension of this work would 
be the study of Renyi entropies, as e.g. in [8], another 
one to different models like the Potts or Blume-Emery-Griffiths 
models [24]. 

Some of the entropy measures studied (like, e.g., the 
average entropy $h_c$ given in Eq. (13)) are related to the 
thermodynamic entropy, but not all. The reason is on the 
one hand that we studied several MIs that have no direct 
counterpart in thermodynamics. On the other hand, every 
Shannon information has to be understood relative 
to some conditioning, and these conditionings differ for 
the different entropies studied above. The very concept 
of MI is closely related to this, as the MI between A and 
B is just the difference between the unconditional infor- 
mation needed to specify A and the information needed 
to specify A, conditioned on knowing B. 

In cases where one wants to describe the state of a 
bi-infinitely long string of "letters" , the MI between the 
two halves of the string is called its excess entropy. This 
is the quantity most directly involved, if we want to 
know whether the entropy is strictly extensive or devi- 
ates from extensivity due to long range correlations. The 
2-dimensional Ising model was studied because it devel- 
ops such long range correlations exactly at the critical 
point. It is thus natural to expect strong corrections to 
extensivity exactly at the critical point. We showed that 
this is indeed the case: The information needed to describe 
the states of contiguous strings of $n$ spins contains 
in general a term $\propto \ln n$. The amplitude in front of this 
term depends on the geometry of the string in the large-$n$ 
limit (it seems, e.g., to be twice as large for rings encircling 
infinitely long cylinders as for open strings), but 
it seems to be otherwise universal. Within numerics, it is 
the same for square and triangular lattices, and for blocks 
of size $1 \times n$ and $2 \times n$. We conjecture that such logarithmic 
corrections hold for any family of subsets of spins that 
scales in the limit $n \to \infty$, and for any 2-dimensional critical 
phenomenon - or maybe even in higher dimensions. 
We suggest that this is the main open question raised by 
the present study. 

On the other hand, we showed explicitly that long 
range correlations alone do not by necessity lead to large 
excess entropies and thus to strong deviations from ex- 
tensivity. Even when the correlations diverge at the crit- 
ical point, the Ising model is still Markovian, as best seen 
from the Markovian structure of the transfer matrix. If 
all relevant degrees of freedom are explicit, then all MIs
between two regions are bounded by the entropy of the 
interface between them. In the case of a long strip of 
finite width, the excess entropy is thus bounded by the 
width and cannot diverge at the critical point. It is only 
when one considers long strings of spins embedded in a 
large background which is not treated explicitly, that di- 
verging mutual entropies can occur. 



This remark is also relevant for the holographic prin- 
ciple in classical spin systems [25]. In contrast to the 
original holographic principle for black holes, where the 
entropy is given by the surrounding area [26], the holographic
principle in statistical mechanics stipulates that
the mutual information between a finite (sub-)system and 
its environment is bounded by the interface area. In gen- 
eral one might suspect that this interface "gets fuzzy" 
at a critical point, and that its area should therefore be 
replaced by the product between the area and the cor- 
relation length [5S]. In several cases it was found that 
this is not needed, and the reason is obvious from the 
above: The relevant "thickness" of the interface is not 
given by the correlation length, but by the order of the 
Markov field - which is small for most models studied in 
statistical physics. 

Although we have studied only 2-dimensional systems 
in the present paper, we expect most results to carry over 
to higher dimensions. In particular, the excess entropy 
for a long system with finite cross section should be finite, 
while the one for a string of spins embedded in an infinite 
lattice should diverge logarithmically at the critical point. 

Let us make a last remark on the logarithmic divergence
of the excess entropy for spin chains. Superficially,
this is very similar to the behavior of self-avoiding walks
(SAW) [11]. For SAW in $d < 4$ dimensions the partition
sum (i.e. the number of distinct configurations of $n$-step
walks) is given asymptotically by $Z_n \sim e^{\mu n} n^{\gamma - 1}$. Thus
the entropy (the logarithm of the partition sum) contains
an extensive part $\mu n$ and a logarithmically diverging part
$(\gamma - 1) \ln n$. The latter is universal with respect to the
type of lattice, but depends on the topology of the SAW.
In particular, $\gamma$ is different for open SAWs and for closed
loops [57]. Notice, however, an important difference to
our present problem: While $\gamma$ is defined only by averaging
over all walk geometries with a given topology, the
analogous constant $b$ defined in Eq. (27) is defined for
each individual geometry.



VIII. APPENDIX A 

Let us call the $2^w \times 2^w$ transfer matrix $T$. Its leading
eigenvalue $\lambda$ is related to the partition sum by $Z \sim \lambda^L$.
The corresponding right and left eigenvectors $|\psi\rangle$ and
$\langle\phi|$ are normalized such that $\langle\phi|\psi\rangle = 1$. It is well known that
the probability for the spin state $i \in \{-1, 1\}^w$ is given by

$$p_i = \phi_i \psi_i . \qquad (29)$$

Our strategy for calculating $H_1(w)$ is thus to iterate simultaneously
the two equations

$$\psi_{i,t+1} = \sum_j T_{ij}\, \psi_{j,t} , \qquad \phi_{i,t+1} = \sum_j \phi_{j,t}\, T_{ji} \qquad (30)$$

and normalize them, until

$$H_{1,t}(w) = -\sum_i p_{i,t} \log p_{i,t} \quad \text{with} \quad p_{i,t} = \psi_{i,t}\, \phi_{i,t} \qquad (31)$$

has converged to its limit $H_1(w) = \lim_{t \to \infty} H_{1,t}(w)$.

FIG. 16. (Color online) Residues in fitting $w^{-1}H_1(w, \beta_c) - f(w)$,
where the fit $f(w)$ is a polynomial in $1/w$ (see Eq. (32)),
for the 1-d Ising chain in a transverse magnetic field, using the
data from [5]. Data are plotted against $1/w$. Thus, if both the
data and the fit have simple asymptotics, the residues should
be smooth for small $1/w$ and their extrapolation should pass
through zero. The continuous curve is such an extrapolation.



Actually, we did not use in Eqs. (30) the transfer matrix
for adding an entire line, but we wrote $T$ as a product
over $w$ 'partial' transfer matrices where we added one
spin in each [28]. The advantage is that these partial
transfer matrices are sparse (they have just two entries
in each row), and thus the CPU time required for one
iteration is reduced from $O(4^w)$ to $O(4w\, 2^w)$ (one loses
a factor 2 because the partial transfer matrices are not
symmetric, whence both iterations of Eqs. (30) have to
be actually done, while they would be identical if the full
transfer matrix were used). This allowed us to obtain
$H_1(w = 29)$ in about 35 CPU hours on a modern workstation. All
calculations were done in extended (80 bit) precision,
and it was checked that they agreed with results obtained
with 64 bit double precision.
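The iteration of Eqs. (29)-(31) can be illustrated for small $w$ with a dense transfer matrix. The following is only a minimal sketch, not the code used for the results above: the function names, the coupling $J = 1$, the stopping tolerance and the particular non-symmetric factorization $T = DW$ (bonds inside a ring times bonds between rings) are assumptions made for illustration, and the sparse 'partial' transfer matrices are not implemented.

```python
import numpy as np

def ring_states(w):
    """All 2^w spin states of a ring of w spins, as rows of +1/-1."""
    idx = np.arange(2 ** w)
    bits = (idx[:, None] >> np.arange(w)) & 1
    return (2 * bits - 1).astype(float)

def transfer_matrix(w, beta):
    """Dense non-symmetric T = D W (illustrative choice, J = 1)."""
    s = ring_states(w)
    intra = np.sum(s * np.roll(s, -1, axis=1), axis=1)  # bonds around the ring
    inter = s @ s.T                                     # bonds to the next ring
    return np.exp(beta * intra)[:, None] * np.exp(beta * inter)

def ring_entropy(w, beta, tol=1e-12, max_iter=100000):
    """Iterate right and left vectors until the ring entropy has converged."""
    T = transfer_matrix(w, beta)
    psi = np.ones(T.shape[0])   # right vector
    phi = np.ones(T.shape[0])   # left vector
    h_old = np.inf
    for _ in range(max_iter):
        psi = T @ psi
        phi = phi @ T
        psi /= np.linalg.norm(psi)
        phi /= np.linalg.norm(phi)
        p = psi * phi
        p /= p.sum()            # enforces <phi|psi> = 1
        h = -np.sum(p * np.log(p))
        if abs(h - h_old) < tol:
            break
        h_old = h
    return h
```

Because $T$ is not symmetric in this sketch, both iterations are needed, as in the partial transfer matrix scheme. At $\beta = 0$ all ring states are equally likely, so `ring_entropy(w, 0.0)` should return $w \ln 2$.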



IX. APPENDIX B 

In [5], the value of $r_c = r(\beta_c)$ was obtained by extrapolating
data for the 1-d Ising chain in a transverse field, for
$w = 16, 18, \ldots, 44$. While the raw data (i.e. the values of
$H_1(w, \beta_c)$) are precise up to 15 digits, we believe that the
extrapolation for $w \to \infty$ was not done optimally. We
thus present here an alternative analysis that provides,
in our opinion, a much more precise value of $r(\beta_c)$.

As in [5], our analysis is based on least-square fitting
$w^{-1}H_1(w, \beta_c)$ by a polynomial in $1/w$,

$$w^{-1}H_1(w, \beta_c) \approx f(w) := a_0 + a_1/w + \ldots + a_k/w^k , \qquad (32)$$

but the details of the fits are quite different.



At a first try, we fitted all 15 values (for even $w = 16, \ldots, 44$)
with a polynomial of order 7. In order to obtain
a good fit for large values of $w$, possibly at the cost of
obtaining a bad fit for small $w$, we made a weighted fit,
i.e. we minimized the weighted sum

$$\chi_0 = \sum_w w^m \left[ w^{-1}H_1(w, \beta_c) - f(w) \right]^2 \qquad (33)$$

with a large positive value of $m$ (in most fits we used $m = 5$
to 7). Since the coefficients $a_k$ are not well constrained
by such a fit and would e.g. increase rapidly with $k$, if
there were non-analytic terms in the true $f(w)$, we also
added to $\chi_0$ a penalty term of the form $\lambda \sum_k a_k^2$. Finally,
we want $f(w)$ to be such that $f(w) \to h(\beta_c)$ for $1/w \to 0$,
which means that the residues

$$e_w = w^{-1}H_1(w, \beta_c) - f(w) \qquad (34)$$

should vanish for $1/w = 0$. We cannot impose this as a
constraint, of course, since we do not know the values
of $w^{-1}H_1(w, \beta_c)$ for large $w$. But we know that both
$H_1(w, \beta_c)$ and $f(w)$ should be smooth functions, and
thus their difference (i.e. the residues) should be also
smooth. We thus added yet another term that controls
the derivative of $f(w)$ at the largest accessible value of
$w$. The total cost function to be minimized is thus

$$\chi = \chi_0 + \lambda \sum_k a_k^2 + \mu \left[ e_{44} - e_{42} - b \right]^2 \qquad (35)$$

with $\lambda$, $\mu$ and $b$ free parameters.
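Since all three terms of the cost function in Eq. (35) are quadratic in the coefficients $a_k$, its minimum can be found by ordinary linear least squares on a stacked system. The following is a minimal sketch under that observation, not the fitting code used here: the function name `penalized_fit`, its defaults and the rescaling of $1/w$ by the smallest $w$ (introduced only for numerical conditioning) are assumptions, and the data points are assumed to be sorted by increasing $w$ so that the last two are the ones entering the slope term.

```python
import numpy as np

def penalized_fit(w, y, order=5, m=6, lam=0.0, mu=0.0, b=0.0):
    """Minimise chi = sum_w w^m (y_w - f(w))^2 + lam * sum_k a_k^2
                     + mu * (e_last - e_prev - b)^2,
    with f(w) = sum_k a_k / w^k and residues e_w = y_w - f(w).
    Returns the coefficients a_0, ..., a_order."""
    w = np.asarray(w, dtype=float)      # assumed sorted in increasing order
    y = np.asarray(y, dtype=float)
    k = np.arange(order + 1)
    s = w.min()                         # rescale x = s / w for conditioning
    X = (s / w)[:, None] ** k           # design matrix for c_k = a_k / s^k
    scale = s ** k                      # a_k = c_k * s^k
    sw = w ** (m / 2.0)                 # square roots of the weights w^m
    A = np.vstack([
        sw[:, None] * X,                          # weighted data term
        np.sqrt(lam) * np.diag(scale),            # penalty on the a_k
        np.sqrt(mu) * (X[-1] - X[-2])[None, :],   # slope-controlling term
    ])
    rhs = np.concatenate([
        sw * y,
        np.zeros(order + 1),
        [np.sqrt(mu) * (y[-1] - y[-2] - b)],
    ])
    c = np.linalg.lstsq(A, rhs, rcond=None)[0]
    return c * scale
```

For example, with made-up data `y = 0.3 + 0.25 / w - 1.2 / w**2` on `w = np.arange(16, 46, 2)`, calling `penalized_fit(w, y, order=3)` should recover `a[0]` close to 0.3 and `a[1]` close to 0.25. These numbers are purely illustrative and are not the values of $h(\beta_c)$ or $r_c$.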
After extensive trials, we found that: 

• There is indeed no indication that $w^{-1}H_1(w, \beta_c)$ is
not described by a power series in $1/w$, and even
without any penalty the coefficients tend very
rapidly to zero for large $k$. We thus put $\lambda = 0$
and we truncated $f(w)$ to a fifth order polynomial.
Higher order terms would have very small coefficients
and have virtually no effect on the estimates
of $r_c$ and $h(\beta_c)$.

• The power $m$ can be fixed to $m = 6$. Other values
gave very similar results.

• The last (slope-controlling) term in Eq. (35) is crucial.
We fixed $\mu$ and $b$ such that an extrapolation of
the residues quadratic in $1/w$ would pass through
$e_\infty = 0$ (see Fig. 16).



Residues obtained by a typical fit, together with their
extrapolation to $1/w \to 0$, are shown in Fig. 16. For all
$w > 18$ they are < 10^^^. By comparison, the fits used in
[5] typically had residues ~ 10~^. The corresponding values
of $r_c$ and $h(\beta_c)$ are given in Eqs. (22) and (23). The errors
are obtained by trying different constants in Eq. (35).

Once $r_c$ has been obtained in this way, we can use it
as a constraint when estimating $h(\beta_c)$ for the classical
Ising model. In that case we found again no hint for a
non-analytic behavior in $1/w$. On the other hand, the
coefficients $a_k$ did not decrease with $k$ as fast as for the
1-d Ising chain in a transverse field, whence we used an
eighth order polynomial and a small positive
value of $\lambda$ to keep the coefficients small. Again the last
term in Eq. (35) was important.

FIG. 17. (Color online) Convergence of entropy estimators
for the critical 2-d Ising model.



X. APPENDIX C 

For simulating the Ising model, we used the Wolff [12]
algorithm. When estimating entropies with high preci- 
sion from histograms generated by Monte Carlo simula- 
tions, one has to cope with three problems: Finite size 
corrections, transients, and finite sample corrections in 
the histograms. 
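For orientation, a single Wolff cluster update can be sketched as follows. This is a generic textbook version and not the simulation code used here: it assumes coupling $J = 1$, a small $L \times L$ lattice with ordinary periodic (not helical) boundary conditions, and the function name `wolff_update` is hypothetical.

```python
import numpy as np

def wolff_update(spins, beta, rng):
    """Grow and flip one Wolff cluster in place; return the cluster size.

    spins: L x L integer array of +1/-1 with periodic boundaries (assumption).
    """
    L = spins.shape[0]
    p_add = 1.0 - np.exp(-2.0 * beta)   # bond probability for J = 1
    i, j = rng.integers(0, L, size=2)   # random seed site
    seed_spin = spins[i, j]
    spins[i, j] = -seed_spin            # flipping on visit marks the site as done
    stack = [(i, j)]
    size = 1
    while stack:
        x, y = stack.pop()
        neighbours = (((x + 1) % L, y), ((x - 1) % L, y),
                      (x, (y + 1) % L), (x, (y - 1) % L))
        for nx, ny in neighbours:
            # only spins still aligned with the original seed spin can join
            if spins[nx, ny] == seed_spin and rng.random() < p_add:
                spins[nx, ny] = -seed_spin
                stack.append((nx, ny))
                size += 1
    return size
```

A run would call `wolff_update(spins, beta, np.random.default_rng(seed))` repeatedly and measure time in flipped spins per site, i.e. the accumulated cluster sizes divided by $L^2$.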

First simulations on a lattice of size $2048 \times 2048$ suggested
important finite size corrections, whence we finally
made simulations on lattices with $L = 4096$, 16384, and
65536. As seen from Fig. 12, the latter is indeed needed,
but is also big enough. We used helical boundary conditions.

As concerns transient times, we started off using ran- 
dom initial configurations. In that case, our results at Tc 
were clean only when we discarded transients with ^ 10'^ 
spin flips per site. This might seem to contradict the sup- 
posed very short correlation time for the Wolff algorithm, 
but the latter is only for correlations in equilibrium, not 
for the relaxation towards it. If one starts with a random 
spin configuration, transients are dominated for a long 
time by configurations where part of the lattice has al- 
ready long range correlations, while the other part is still 
disordered. During this time the number of flipped clus- 
ters is roughly the same in both regions, but the number 
of spin flips is vastly larger in the former. Thus it takes 
very long - when time is measured in terms of spin flips - 
until the latter also gets ordered. This effect is decreased 
by starting with a finite (but not too large) magnetiza- 
tion. We found empirically that transients were shortest 



when the starting configuration was at the percolation 
threshold, i.e. when one of the first clusters to be flipped 
was a giant cluster spanning the entire lattice. 

After the transient, we made histograms with $2^n$ entries
with $n = 29$. Each entry corresponds to a horizontal
or vertical line $S^n$ of spins. This histogram was updated
after every $\approx 50$ spin flips per site, by shifting $S^n$ through
all positions. The simulations were stopped when the
histograms had $M$ > 5 x 10^^ entries, corresponding to
about six days of CPU time. From any histogram with
$2^n$ entries we can obtain histograms with $2^{n'}$ entries with
$n' < n$ corresponding to shorter strings by coarse graining.
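The coarse graining amounts to summing histogram slots that differ only in the spins being dropped. A minimal sketch, assuming (hypothetically) that the slot index is the binary encoding of the spin string and that the spin to be dropped is the lowest bit:

```python
import numpy as np

def coarse_grain(hist):
    """Histogram for strings of n-1 spins from the one for n spins.

    Assumes slot index = binary code of the string, dropped spin = lowest bit.
    """
    hist = np.asarray(hist)
    return hist.reshape(-1, 2).sum(axis=1)

def all_shorter(hist):
    """List of histograms for n, n-1, ..., 1 spins."""
    out = [np.asarray(hist)]
    while out[-1].size > 2:
        out.append(coarse_grain(out[-1]))
    return out
```

Each coarse-graining step halves the number of slots and leaves the total number of entries $M$ unchanged.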

In spite of the very large number $M$ of histogram entries,
there are still substantial finite-$M$ corrections to
the naive entropy estimator



$$H_{\mathrm{naive},n} = -\sum_{i=1}^{2^n} \frac{m_i}{M} \log \frac{m_i}{M} , \qquad (36)$$

where $m_i$ is the number of entries in the $i$-th slot of the
histogram, with $\sum_i m_i = M$. More precisely, this estimator
always underestimates the entropy, by mistaking
statistical fluctuations for real (entropy-reducing) structure.
Therefore we used the improved estimator from
[29]. In this estimator the logarithm is first split up into
$\log m_i - \log M$, and then the (natural) logarithms of integers
are replaced by a function $G_{m_i}$, where

$$G_0 = G_1 = -\gamma - \ln 2, \qquad G_{2n+2} = G_{2n} + \frac{2}{2n+1} , \qquad (37)$$

and $\gamma = 0.577215\ldots$ is the Euler-Mascheroni constant.

(38)



Other estimators with smaller bias do exist [25], but there
is in general a trade-off: One can either minimize the bias
or the statistical variance of any entropy estimator, but
not both. Eq. (37) is constructed such that it is close
to optimal in the case where $m_i \ll M$ for all $i$. The
estimates for the block entropy difference $H_{29} - H_{28}$ for a
$65536^2$ lattice are shown in Fig. 17. We see there that the
naive estimator would still be substantially biased, while
both the bias and the statistical error for the improved
estimator are « 10~^. In Fig. 17 we also show the naive
estimator with Miller correction [30],

$$H_{\mathrm{Miller},n} = H_{\mathrm{naive},n} + \frac{N_n - 1}{2M} , \qquad (39)$$

where $N_n$ is the number of non-zero histogram entries.
Although the Miller correction removes the bias to leading
order in the limit $M \to \infty$, $H_{\mathrm{Miller},n} - H_n = o(1/M)$,
it corrects only for about half of the bias in the present
case.
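The three estimators compared here can be sketched as below. This is an illustration only, under two assumptions that the text above does not spell out in legible form: the recursion of Eq. (37) is completed for odd indices by $G_{2n+1} = G_{2n}$, and the improved estimate is taken as $\ln M - M^{-1} \sum_i m_i G_{m_i}$, i.e. the naive estimator with $\ln m_i$ replaced by $G_{m_i}$. All function names are hypothetical.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329

def g_table(m_max):
    """G_0 ... G_{m_max} from the recursion of Eq. (37)."""
    g = np.zeros(m_max + 2)
    g[0] = g[1] = -EULER_GAMMA - np.log(2.0)
    for k in range(2, m_max + 2, 2):        # k = 2n + 2
        g[k] = g[k - 2] + 2.0 / (k - 1)     # G_{2n+2} = G_{2n} + 2/(2n+1)
        if k + 1 < g.size:
            g[k + 1] = g[k]                 # assumed rule G_{2n+1} = G_{2n}
    return g[:m_max + 1]

def naive_entropy(counts):
    """Plug-in estimator of Eq. (36), in nats."""
    m = counts[counts > 0].astype(float)
    M = m.sum()
    return -np.sum(m / M * np.log(m / M))

def miller_entropy(counts):
    """Naive estimator plus the Miller correction of Eq. (39)."""
    m = counts[counts > 0]
    return naive_entropy(counts) + (m.size - 1) / (2.0 * m.sum())

def g_entropy(counts):
    """Improved estimator: ln m_i replaced by G_{m_i} (assumed form)."""
    m = counts[counts > 0]
    M = m.sum()
    g = g_table(int(m.max()))
    return np.log(M) - np.sum(m * g[m]) / M
```

As a rough check on synthetic data, one can draw a few thousand samples uniformly from 1024 slots with `np.random.default_rng`, build `counts` with `np.bincount(samples, minlength=1024)`, and compare the three estimates with the exact value $\ln 1024$.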